xz: Add --synchronous #159

Closed

@Larhzu Larhzu commented Dec 27, 2024

xz's default behavior is to delete the input file after successful compression or decompression (unless writing to standard output). If the system crashes soon after the deletion, it is possible that the newly written file has not yet hit the disk while the previous delete operation might have. In that case neither the original file nor the written file is available.

The --synchronous option makes xz call fsync() on the file and possibly the directory where the file was created. A similar option was added to GNU gzip 1.7 in 2016.
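As a rough illustration of the ordering described above, the sketch below syncs the output file, then its containing directory, and only then unlinks the input. It is not the code from this PR: the helper names `sync_parent_dir` and `sync_then_unlink` are hypothetical, and it assumes a POSIX system where fsync() on a directory file descriptor is supported.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not xz's actual code): flush the directory entry
 * for `path` by calling fsync() on the directory that contains it. */
static int sync_parent_dir(const char *path)
{
    /* dirname() may modify its argument, so work on a private copy. */
    char *copy = strdup(path);
    if (copy == NULL)
        return -1;

    /* The pointer returned by dirname() may point into `copy`,
     * so open the directory before freeing the copy. */
    int dir_fd = open(dirname(copy), O_RDONLY);
    free(copy);
    if (dir_fd == -1)
        return -1;

    int ret = fsync(dir_fd);
    int saved_errno = errno;
    close(dir_fd);
    errno = saved_errno;
    return ret;
}

/* Hypothetical helper: make `dest` durable first, then remove `src`.
 * If either sync fails, `src` is left in place. */
static int sync_then_unlink(int dest_fd, const char *dest, const char *src)
{
    if (fsync(dest_fd) != 0)
        return -1;
    if (sync_parent_dir(dest) != 0)
        return -1;
    return unlink(src);
}
```

The point of the ordering is that the input file is removed only after both the new file's data and its directory entry have been handed to stable storage, so a crash in between leaves at least one of the two files available.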

Larhzu and others added 2 commits December 27, 2024 09:15
xz's default behavior is to delete the input file after successful
compression or decompression (unless writing to standard output).
If the system crashes soon after the deletion, it is possible that
the newly written file has not yet hit the disk while the previous
delete operation might have. In that case neither the original file
nor the written file is available.

The --synchronous option makes xz call fsync() on the file and possibly
the directory where the file was created. A similar option was added to
GNU gzip 1.7 in 2016. There are some differences in behavior:

  - When writing to standard output and processing multiple input files,
    xz calls fsync() after every file while gzip does so only after all
    files have been processed.

  - This has no effect on "xz --list". xz doesn't sync standard output
    in --list mode but gzip does.

Portability notes:

  - <libgen.h> and dirname() should be available on all POSIX systems,
    and aren't needed on non-POSIX systems.

  - fsync() is available on all POSIX systems. The directory syncing
    could be changed to fdatasync() although at least on ext4 it
    doesn't seem to make a performance difference in xz's usage.
    fdatasync() would need a build system check to support (old)
    special cases, for example, MINIX 3.3.0 doesn't have fdatasync()
    and Solaris 10 needs -lrt.

  - On native Windows, _commit() is used to replace fsync(). Directory
    syncing isn't done and shouldn't be needed. (In Cygwin, fsync() on
    directories is a no-op.) It is known that syncing will fail if
    writing to stdout and stdout isn't a regular file.

  - DJGPP has fsync() for files. ;-)

Co-authored-by: Sebastian Andrzej Siewior <[email protected]>
Link: https://bugs.debian.org/814089
Link: https://www.mail-archive.com/[email protected]/msg00282.html
Closes: #151
Closes: #159
@Larhzu Larhzu closed this Jan 4, 2025
@Larhzu Larhzu deleted the synchronous branch January 5, 2025 20:28